Abstract — Without suitable technology, a great deal of time is
wasted in identifying brain signals as seizure or non-seizure.
Many tests are usually performed to detect the disease or to
establish whether the patient is cured or healthy, and these
tests produce a very large number of records. Running many
diagnostic procedures can muddle the actual diagnosis and make
it difficult to obtain a reliable result, especially when many
tests are performed. These problems can be mitigated by using
classifiers to classify the records. Several classifiers are
available, such as SVM (support vector machine), k-NN
(k-nearest neighbours) and the discriminant classifier. In this
study we compare these classifiers on the basis of their
accuracy, sensitivity and specificity.

Index Terms — SVM - support vector machine, k-NN - 
K-nearest neighbour, EEG - electroencephalography, EMD - 
Empirical mode decomposition, IMF - Intrinsic mode function 


I. Introduction 

The objective of this study is to examine the behaviour of
different classifiers, using MATLAB, on an EEG data set of
five subsets with different attributes. A central problem in
the analysis of neural (brain) signals is extracting the
useful knowledge needed for a correct diagnosis. For better
treatment many procedures are followed, and these generate a
huge amount of data; carrying them all out is necessary for an
effective diagnosis. On the other hand, these procedures
produce a colossal collection of diagnostic records, which
complicates treatment and makes it hard to reach a final
conclusion. Such problems can be addressed with classification
techniques, which help to extract the final report from the
records. Several classifiers are available, such as SVM
(support vector machine), k-NN (k-nearest neighbours) and the
discriminant classifier. In this study we compare these
classifiers on the basis of their accuracy, sensitivity and
specificity. Classification covers a wide range of procedures
and is hard to define without vagueness. In practice,
classifiers are useful for picking out the necessary records
from a massive collection of data and organising them. Our
objective in this work is to analyse the behaviour of
different classifiers on a large collection of data.
An EEG data set consisting of 5 subsets with different
attributes is used in this work to bring out the differences
between the classifiers. The classifier with the best
accuracy, the fastest processing and the greatest potential
will then be proposed for classifying large sets of neural
(brain) signal records, or for other general applications. The
three classifiers that we used, i.e. SVM (support vector
machine), k-NN (k-nearest neighbour) and the discriminant
classifier, are all available in MATLAB. The data used in this
work comprise 100 signals in each of 5 sets named Z, O, F, N
and S. These data sets are publicly available and have been
used by many researchers, so well-established results exist
against which to check our demonstration. Apart from these
classifiers, our work involves the EMD algorithm together with
an AM (amplitude modulation) and FM (frequency modulation)
feature extraction technique. This feature extraction
technique was previously used by Ram Bilas Pachori and Varun
Bajaj in the paper titled “Classification of seizure and
non-seizure EEG signals using empirical mode decomposition”,
which also used the data set used in this work. The rest of
the paper is organised as follows: data set description,
feature extraction method and parameters, description of
classifiers and comparison, and finally the conclusion.

II. Data Set

The data set used in this work is known as the Bonn data set,
which is publicly available online [1]. It has 5 subsets, i.e.
Z, O, N, F and S, each containing 100 single-channel EEG
signals of 23.6 seconds duration. The signals were cut out of
continuous multichannel EEG recordings after visual inspection
for artifacts. The subsets have different attributes: Z and O
were recorded extra-cranially, whereas N, F and S were
recorded intra-cranially. The extra-cranial recordings are
surface EEG recordings acquired from five healthy persons with
their eyes open (Z) and closed (O). Subset F was acquired
during non-ictal (seizure-free) intervals from five patients,
in the epileptogenic zone, whereas subset N was acquired from
the hippocampal formation of the opposite hemisphere of the
brain. The final subset, S, contains recordings of ictal
activity, i.e. seizure signals. We therefore have one seizure
subset and four seizure-free subsets, all sampled at a
frequency (fs) of 173.61 Hz. In this paper we form two
classes: the ictal class, containing only subset S, and the
ictal-free class, containing the 4 subsets Z, O, N and F. The
figure shows a sample recording from each subset.
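
As a rough illustration of the two-class grouping described
above, the following sketch labels placeholder records. The
subset names, sizes, duration and sampling frequency come from
the text; the function and variable names are hypothetical, and
no real EEG data is loaded.

```python
# Sketch only: labels for the ictal / ictal-free grouping of the five subsets.
# Names such as SUBSETS and make_labels are illustrative, not from the study.

FS_HZ = 173.61            # sampling frequency stated in the text
DURATION_S = 23.6         # duration of each single-channel signal
SIGNALS_PER_SUBSET = 100
SUBSETS = ["Z", "O", "N", "F", "S"]

# Approximate number of samples per signal implied by duration and rate.
samples_per_signal = round(DURATION_S * FS_HZ)


def make_labels(subsets=SUBSETS, per_subset=SIGNALS_PER_SUBSET):
    """Return (subset, index, label) triples; label 1 = ictal, 0 = ictal-free."""
    labels = []
    for name in subsets:
        label = 1 if name == "S" else 0  # only subset S contains seizures
        for i in range(per_subset):
            labels.append((name, i, label))
    return labels


labels = make_labels()
n_ictal = sum(lab for _, _, lab in labels)
n_free = len(labels) - n_ictal
print(samples_per_signal, n_ictal, n_free)
```

The grouping is strongly imbalanced (one ictal subset against
four ictal-free subsets), which is one reason sensitivity and
specificity are reported alongside accuracy.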

III. Related work

Electroencephalography has made enormous progress over more
than a century. The existence of electrical currents in the
brain was discovered in 1875 by the English physician Richard
Caton. In 1924 Hans Berger, a German neurologist, used
ordinary radio equipment to amplify the brain's electrical
activity measured on the human scalp. Epilepsy is a
neurological disorder that affects about 1% of the world's
population; its causes include brain tumours, brain injury,
strokes and substance abuse. Numerous works have addressed the
diagnosis of such disorders. R. B. Pachori and Varun Bajaj
proposed a technique for the classification of EEG signals:
the signal is decomposed using the EMD method [2], AM and FM
bandwidth parameters are computed from the result, and the
extracted features are finally classified using LS-SVM, i.e.
the least squares support vector machine [3]. R. B. Pachori
has published numerous works on the diagnosis of EEG signals
using different parameters, with methodologies including
classification using fractional linear prediction, local
binary patterns and the phase space representation of
IMFs [4]. Apart from EMD-based approaches, time-frequency
analysis was carried out by Alexandros T. Tzallas, Markos G.
Tsipouras and Dimitrios I. Fotiadis [5] in “Epileptic seizure
detection in EEGs using time-frequency analysis”. In that work
features are extracted from the EEG signal using several
time-frequency based algorithms and classified using an ANN
classifier, with the data divided into three classes: class I
contains subsets Z and S, class II contains Z, N and S, and
class III contains all the subsets Z, O, N, F and S. The
average accuracy obtained for the three classes is 94.27%,
94.68% and 80.33% respectively. In our work we have divided
the data into several classes, processed them using different
classifiers and tried to obtain better accuracy, sensitivity
and specificity.

IV. Methodology 

A. Empirical Mode Decomposition [2]

The empirical mode decomposition method is widely preferred
because it is a flexible, data-driven process that does not
require the signal to be linear or stationary. With this
method, a non-linear and non-stationary signal x(t) is
decomposed into a sum of intrinsic mode functions (IMFs).
Several feature extraction methods based on EMD have been
proposed [8]. In all of them the EMD of each signal is
computed first to obtain its IMFs, and various techniques are
then applied to the IMFs to make the categorization of the
signals easier.

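The EMD algorithm [9] for a signal x(t) can be defined as -

Determine the local mean a(t) of the upper envelope e_m(t) and
the lower envelope e_l(t) of x(t):

$a(t) = \dfrac{e_m(t) + e_l(t)}{2}$

Extract the candidate IMF: $h_1(t) = x(t) - a(t)$.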
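
A single sifting step of this kind can be sketched as follows.
This is a minimal illustration, not the implementation used in
this work: it assumes piecewise-linear envelopes instead of the
cubic-spline interpolation normally used in EMD, holds the
envelopes flat beyond the first and last extremum, and all
function names are hypothetical.

```python
import math


def local_extrema(x):
    """Return index lists (maxima, minima) of strict interior local extrema."""
    maxima, minima = [], []
    for k in range(1, len(x) - 1):
        if x[k - 1] < x[k] > x[k + 1]:
            maxima.append(k)
        elif x[k - 1] > x[k] < x[k + 1]:
            minima.append(k)
    return maxima, minima


def linear_envelope(idx, x, n):
    """Piecewise-linear curve through (k, x[k]) for k in idx, held flat outside."""
    env, j = [], 0
    for k in range(n):
        if k <= idx[0]:
            env.append(x[idx[0]])
        elif k >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            while idx[j + 1] < k:
                j += 1
            w = (k - idx[j]) / (idx[j + 1] - idx[j])
            env.append(x[idx[j]] + w * (x[idx[j + 1]] - x[idx[j]]))
    return env


def sift_once(x):
    """One sifting step: h(t) = x(t) - a(t), a(t) being the mean of the envelopes."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None  # no oscillation left, x is treated as a residue
    n = len(x)
    upper = linear_envelope(maxima, x, n)
    lower = linear_envelope(minima, x, n)
    local_mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [xi - ai for xi, ai in zip(x, local_mean)]


# Toy input: a 5-cycle sinusoid riding on a constant offset of 5.
n = 400
tone = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
signal = [5 + s for s in tone]
detail = sift_once(signal)
```

For this toy input the local mean is the constant offset, so
the extracted detail is the sinusoid itself. In a full EMD the
step is repeated until the detail satisfies the IMF conditions,
and the procedure is then applied to the remainder.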

The Hilbert transform is then applied to all the IMFs obtained
by repeating the above algorithm. The analytic signal z(t) of
any real IMF is defined as -

$z(t) = A(t)\,e^{j\phi(t)}$    (1)

where

A(t) = signal amplitude
$\phi(t)$ = signal phase
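
The analytic signal can be computed in the frequency domain by
suppressing the negative-frequency half of the spectrum. The
sketch below assumes this standard construction and uses a
naive O(n^2) Fourier transform to stay dependency-free; the
function names are hypothetical and the short cosine stands in
for a real IMF.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        acc = 0j
        for k in range(n):
            acc += x[k] * cmath.exp(sign * 2j * math.pi * m * k / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return z = x + j*H{x} by zeroing the negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0                      # keep the DC bin once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin once
        for m in range(1, n // 2):
            weights[m] = 2.0              # double the positive frequencies
    else:
        for m in range(1, (n + 1) // 2):
            weights[m] = 2.0
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


# Toy "IMF": a single cosine with 4 cycles over 64 samples.
n = 64
imf = [math.cos(2 * math.pi * 4 * k / n) for k in range(n)]
z = analytic_signal(imf)
amplitude = [abs(v) for v in z]           # A(t)
phase = [cmath.phase(v) for v in z]       # phi(t), wrapped to (-pi, pi]
```

The phase returned here is wrapped and would need unwrapping
before it is differentiated to obtain an instantaneous
frequency.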

B. Analysis of Amplitude Modulation and Frequency Modulation
Bandwidth [6]

The EEG record is decomposed using empirical mode
decomposition and its IMFs are obtained with the above
algorithm. The bandwidth of a signal is a measure of its
spread in frequency over the duration of the record. This
spread arises from deviations from the average frequency, from
variations in amplitude, or from a combination of the two. To
estimate the amplitude modulation bandwidth and the frequency
modulation bandwidth [10], we first estimate the centre
frequency of the IMF as follows -



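$\langle\omega\rangle = \dfrac{1}{E}\int \omega\,|Z(\omega)|^{2}\,d\omega$    (2)

where

$\langle\omega\rangle$ = centre frequency
E = energy of the signal
$Z(\omega)$ = Fourier transform of z(t)

The bandwidth of the analytic IMF is defined as -

$B^{2} = \dfrac{1}{E}\int \left(\omega - \langle\omega\rangle\right)^{2}|Z(\omega)|^{2}\,d\omega$    (3)

It can be further expressed as -

$B^{2} = \dfrac{1}{E}\int \left(\dfrac{dA(t)}{dt}\right)^{2}dt + \dfrac{1}{E}\int \left(\dfrac{d\phi(t)}{dt} - \langle\omega\rangle\right)^{2}A^{2}(t)\,dt$    (4)

This shows that the bandwidth of the signal has two terms,
which depend on the amplitude and on the phase respectively.
The bandwidths due to amplitude modulation and due to
frequency modulation are therefore defined as -

$B_{AM}^{2} = \dfrac{1}{E}\int \left(\dfrac{dA(t)}{dt}\right)^{2}dt, \qquad B_{FM}^{2} = \dfrac{1}{E}\int \left(\dfrac{d\phi(t)}{dt} - \langle\omega\rangle\right)^{2}A^{2}(t)\,dt$

Therefore the total bandwidth is given as -

$B = \sqrt{B_{AM}^{2} + B_{FM}^{2}}$    (5)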
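
The two bandwidths can be estimated numerically from sampled
A(t) and unwrapped phase. The sketch below is an illustration
under stated assumptions: derivatives are taken by finite
differences, integrals by simple sums, and the centre frequency
is computed from the time-domain identity (1/E) * integral of
phi'(t) A(t)^2 dt, which is assumed here rather than taken from
the text. All names are hypothetical.

```python
import math


def gradient(y, dt):
    """Finite-difference derivative: central inside, one-sided at both ends."""
    n = len(y)
    d = [0.0] * n
    d[0] = (y[1] - y[0]) / dt
    d[-1] = (y[-1] - y[-2]) / dt
    for k in range(1, n - 1):
        d[k] = (y[k + 1] - y[k - 1]) / (2 * dt)
    return d


def am_fm_bandwidth(amplitude, phase, dt):
    """Return (centre frequency, B_AM, B_FM, B) in rad/s.

    amplitude is A(t); phase is the unwrapped phi(t).
    """
    energy = sum(a * a for a in amplitude) * dt
    d_amp = gradient(amplitude, dt)
    d_phi = gradient(phase, dt)            # instantaneous frequency
    centre = sum(w * a * a for w, a in zip(d_phi, amplitude)) * dt / energy
    b_am_sq = sum(v * v for v in d_amp) * dt / energy
    b_fm_sq = sum((w - centre) ** 2 * a * a
                  for w, a in zip(d_phi, amplitude)) * dt / energy
    total = math.sqrt(b_am_sq + b_fm_sq)
    return centre, math.sqrt(b_am_sq), math.sqrt(b_fm_sq), total


# Toy AM tone: A(t) = 1 + 0.5 cos(Omega t), phi(t) = omega0 t.
dt, n = 0.001, 2000
omega0, Omega = 2 * math.pi * 20, 2 * math.pi * 3
A = [1 + 0.5 * math.cos(Omega * k * dt) for k in range(n)]
phi = [omega0 * k * dt for k in range(n)]
centre, b_am, b_fm, b_total = am_fm_bandwidth(A, phi, dt)
```

For this toy tone the frequency is constant, so the FM
bandwidth vanishes and the whole bandwidth comes from the
amplitude modulation; analytically B_AM = Omega / 3 here.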

Later on, the LS-SVM (least squares support vector
machine) [3], k-NN and discriminant classifiers are used to
evaluate the effectiveness of the bandwidth parameters in
detecting ictal and ictal-free EEG records.


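The first steps of the EMD algorithm of Section IV-A are:

• From the given EEG record, locate the local maxima and
minima.

• By interpolating the maxima and the minima separately,
generate the upper and lower envelopes.

• Compute the local mean a(t) of the two envelopes, as defined
in Section IV-A.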


V. Training and Classification

Several methods that combine features and classifiers are
implemented. The multi-class problem is approached as a set of
classification problems, so that diverse features and
classifier approaches, tailored to parts of the problem, can
be assembled together; in this way a simple three-class
problem can be handled. Each classifier is used as a base
learner and is trained on the training records, and each class
receives a unique ID.

A. Classification of data 

Classification of the recorded signal data assigns each record
to a level corresponding to a group with homogeneous
characteristics, with the aim of discriminating the different
objects in the data from one another. Each level is called a
class. Classification is performed on the basis of spectral or
spectrally defined features, such as density and texture, in
the feature space. In other words, classification divides the
feature space into several classes based on a decision rule.

The classifier approaches used in this work for the
classification of our EEG data are as follows:

a) Support Vector Machine classifier 

The support vector machine (SVM) is a supervised learning
algorithm [7] developed by Vladimir Vapnik; it was first
presented in 1992 by Boser, Guyon and Vapnik at COLT-92 [8].
For many years neural networks were regarded as the most
effective learning algorithm. SVMs have many useful
applications in real-world problems, including text and image
classification, handwriting recognition, data mining,
bioinformatics, medicine, biosequence analysis and even the
stock market. The SVM determines a separating hyperplane
between the classes of data that maximizes the margin and
minimizes the classification error. With this method we
separate ictal and ictal-free signals. In non-seizure signals
the amplitude envelopes of the IMFs change at a high rate, so
the amplitude modulation bandwidth is larger than for the IMFs
of seizure EEG records. In seizure EEG records the frequency
modulation components of the IMFs change at a lower rate, so
the frequency modulation bandwidth is lower than for the IMFs
of non-seizure signals. We can therefore conclude that the
total bandwidth of the IMFs of ictal EEG records is smaller
than that of the IMFs of non-ictal EEG records.
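
The margin-maximising hyperplane mentioned above can be stated
compactly. The block below gives the standard soft-margin
formulation for training pairs (x_i, y_i) with labels y_i in
{-1, +1}; the symbols w, b, xi_i and C are notation introduced
here for illustration and are not used elsewhere in this paper.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad & y_i\,\bigl(w^{\top}x_i + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0,\quad i = 1,\dots,n, \\
\text{decision rule}\quad & \hat{y}(x) = \operatorname{sign}\bigl(w^{\top}x + b\bigr).
\end{aligned}
```

The margin between the two classes is 2/||w||, so minimising
||w|| maximises it, while C trades margin width against
training errors. Kernel variants replace the inner product with
a kernel function.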

b) K-nearest neighbor (k-NN) classifier 

In pattern recognition, the k-nearest neighbours algorithm
(k-NN for short) [9] is a non-parametric method used for
classification and regression. In both cases the input
consists of the k nearest training examples in the feature
space. The output depends on whether k-NN is used for
classification or regression:

• In k-NN classification the output is a class membership. An
object is classified by a majority vote of its neighbours,
being assigned to the class most common among its k nearest
neighbours (k is a positive integer, typically small). If
k = 1, the object is simply assigned to the class of that
single nearest neighbour.

• In k-NN regression the output is the property value for the
object. This value is the average of the values of its k
nearest neighbours.

k-NN is a type of instance-based learning, or lazy learning,
where the function is only approximated locally and all
computation is deferred until classification. The k-NN
algorithm is among the simplest of all machine learning
algorithms.
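
The majority-vote rule can be sketched in a few lines. This is
a minimal illustration assuming Euclidean distance; the
function name and the toy two-dimensional feature values are
made up and do not come from the EEG data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D features: label 0 = seizure-free, 1 = seizure (values are made up).
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = [0, 0, 0, 1, 1, 1]

pred_a = knn_predict(train_x, train_y, (1.1, 1.0), k=3)
pred_b = knn_predict(train_x, train_y, (3.0, 3.0), k=3)
```

All the work happens at prediction time, which is the "lazy
learning" property noted above: there is no training step
beyond storing the labelled examples.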

c) Discriminant Classifier

Gene expression data on p genes for n tumour mRNA samples may
be summarized by an n x p matrix X = (x_ij), where x_ij
denotes the expression level of gene (variable) j in mRNA
sample (observation) i. The expression levels may be either
absolute (e.g., for the oligonucleotide arrays used to produce
the leukaemia data set) or relative to the expression levels
of a suitably defined common reference sample (e.g., for the
lymphoma and NCI 60 data sets, which are produced using cDNA
microarrays). When the mRNA samples belong to known classes
(e.g., follicular lymphoma), the data for each observation
consist of a gene expression profile x_i = (x_i1, ..., x_ip)
and a class label y_i, that is, of predictor variables x_i and
a response y_i. For K tumour classes, the class labels y_i are
defined to be integers ranging from 1 to K, and n_k denotes
the number of observations belonging to class k. Note that the
expression levels x_ij are in general highly processed data:
the raw data in a microarray experiment consist of data
profiles, and important pre-processing steps include analysis
of these data and normalization. In the publicly available
data, the number of tumours n is typically below 100, whereas
the number of genes p is in the thousands. When comparing
prediction methods, the number of genes is substantially
reduced by identifying a subset of genes whose expression
levels are associated with tumour class. In this work the same
formulation applies with the EEG records as the n observations
and the bandwidth features as the p variables.
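
For reference, a linear discriminant classifier in this
notation assigns an observation x to the class with the largest
discriminant score. The block below states the standard rule
under the usual assumption of Gaussian classes with a common
covariance matrix. The symbols mu_k (class mean), Sigma (pooled
covariance) and pi_k (class prior, e.g. n_k / n) are notation
introduced here; which variant of the rule was used in this
work is not stated in the text.

```latex
\delta_k(x) = x^{\top}\Sigma^{-1}\mu_k
            - \tfrac{1}{2}\,\mu_k^{\top}\Sigma^{-1}\mu_k
            + \log \pi_k ,
\qquad
\hat{y}(x) = \arg\max_{k \in \{1,\dots,K\}} \delta_k(x)
```

Because the score is linear in x, the boundary between any two
classes is a hyperplane in the feature space.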

VI. Result 

After applying this methodology, each record yields 8
amplitude modulation bandwidth values and 8 frequency
modulation bandwidth values, which are concatenated so that
each subset gives a 100 x 16 matrix. Training and testing data
are then selected from this matrix. For example, if subset F
has been processed with the above methodology to give a
100 x 16 data matrix, we select 80 x 16 values as training
data and 20 x 16 values as testing data. The same technique is
applied to the other subsets, with training fractions of 80%,
70%, 60% and 50%. The data are then classified using the
different classifiers, i.e. SVM with different kernel
functions, then k-NN and finally the discriminant classifier,
and the accuracy, error rate, sensitivity and specificity are
obtained for different combinations of seizure and non-seizure
(seizure-free) subsets, i.e. the 'classes' FS, NS, FNS, ZS,
OS, ZOS and ZONFS. We looked for the best accuracy among all
of them and also computed the average accuracy of each
classifier, so that the best classifier can be identified and
used with confidence in future work.

The methodology used in this paper breaks a non-linear and
non-stationary signal into a set of narrow-band AM-FM
components using the EMD algorithm. The resulting AM-FM IMFs
make it straightforward to calculate the bandwidths. After
obtaining the bandwidths due to frequency modulation and
amplitude modulation, we processed the data using the SVM,
k-NN and discriminant classifiers, and obtained the accuracy,
error rate, sensitivity and specificity for each combination
of seizure and non-seizure data sets with four training
fractions: 80%, 70%, 60% and 50%. The accuracy of each
combination of data sets for the different training fractions
is shown as bar graphs in Fig 1, 2 and 3. From these figures
it can be seen that the SVM classifier, with a highest
accuracy of 97.5% and a lowest accuracy of 65%, is the best of
the classifiers. The k-NN classifier is the worst: its highest
accuracy over all classes and training fractions is only 81%,
and its lowest is 53.75%.
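
The selection of training and testing rows from a 100 x 16
feature matrix can be sketched as follows. The text does not
say whether the rows were chosen at random or in sequence, so
the seeded shuffle below is an assumption, as are the function
name and the placeholder zero-valued features.

```python
import random


def split_rows(rows, train_fraction, seed=0):
    """Shuffle row indices reproducibly and split into (train, test) lists."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    n_train = round(train_fraction * len(rows))
    train = [rows[i] for i in order[:n_train]]
    test = [rows[i] for i in order[n_train:]]
    return train, test


# Placeholder 100 x 16 feature matrix: 8 AM + 8 FM bandwidths per record.
features = [[0.0] * 16 for _ in range(100)]

sizes = {}
for fraction in (0.8, 0.7, 0.6, 0.5):
    train, test = split_rows(features, fraction)
    sizes[fraction] = (len(train), len(test))
```

For a combined class such as FNS the same split would be
applied to the stacked matrices of the subsets involved, with
a label column marking the rows that come from subset S.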


[Figure: bar chart titled "Accuracy for discriminant classifier"; x-axis: class (FS, NS, FNS, ZS, OS, ZOS, ZONFS)]

Fig 3: Accuracy of different combinations of data sets using
the discriminant classifier


[Figure: bar chart titled "Accuracy for SVM"; x-axis: class (FS, NS, FNS, ZS, OS, ZOS, ZONFS)]

Fig 1: Accuracy of different combinations of data sets using
the SVM classifier




Table 1: Error rate using different kernels in SVM for 80 %
training data and FNS class

Sr. no.  Classification  Kernel      Accuracy  Error rate
         machine used    function    (in %)    (in %)
1.       SVM             Linear      95        5
2.       SVM             RBF         86.67     13.33
3.       SVM             Quadratic   85        15
4.       SVM             Polynomial  88.33     11.67
5.       SVM             MLP         83.33     16.67



[Figure: bar chart titled "Accuracy for KNN classifier"; x-axis: class (FS, NS, FNS, ZS, OS, ZOS, ZONFS)]

Fig 2: Accuracy of different combinations of data sets using
the KNN classifier

Fig 4: Confusion matrix of FS class

On the other hand, the discriminant classifier, or classify
classifier, is much more accurate than k-NN, while its
accuracy is slightly lower than that of SVM. Its highest
accuracy is 93.33%, obtained for the FNS class with 80%
training data, in the classification of neural signals for the
detection of seizure and non-seizure. The SVM classifier is
therefore the best of the classifiers used. Its results with
all its kernel functions, for 80% training data and the FNS
class, are shown in Table 1. The best accuracy is obtained
with the linear kernel function and the worst with the MLP
kernel function. The result can also be verified with the
confusion matrix: the ratio of true positives to true
negatives is best for the SVM classifier on class FS with 80%
training data. The ratio obtained is 19:1, and the best result
for false positives and false negatives, i.e. 20:1, is
obtained for the same case, as shown in the confusion matrix
of class FS in Fig 4.
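
The four reported measures follow directly from the entries of
a confusion matrix. The sketch below uses hypothetical counts
for a 60-record test set; they were chosen because they
reproduce the 93.33%, 86.36% and 97.37% quoted in the
conclusion for the discriminant classifier on the FNS class,
but they are an inference and not a confusion matrix reported
in this paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, error rate, sensitivity and specificity in percent."""
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 100.0 - accuracy,
        "sensitivity": 100.0 * tp / (tp + fn),  # seizure records detected
        "specificity": 100.0 * tn / (tn + fp),  # seizure-free records detected
    }


# Hypothetical counts (assumption): seizure is taken as the positive class.
m = classification_metrics(tp=19, fn=3, tn=37, fp=1)
rounded = {name: round(value, 2) for name, value in m.items()}
```

Sensitivity and specificity separate the two kinds of error,
which matters here because seizure records are a minority in
most of the class combinations.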

VII. Conclusion 

From the above results it can be concluded that the best
classifier for the classification of neural signals for the
detection of seizure and non-seizure is the SVM classifier,
with an accuracy of up to 97.5% and an error rate as low as
2.5%. This result is the outcome of testing various
combinations of data sets, i.e. classes; the best result is
obtained for the FS class using SVM with 80% training data,
where the accuracy and error rate are 97.5% and 2.5%
respectively. Among the other classifiers, the best result is
obtained with the discriminant (classify) classifier: 93.33%
accuracy for the FNS class with 80% training data, with a
sensitivity and specificity of 86.36% and 97.37% respectively.